ABSTRACT


In this thesis, we discuss the development of the necessary
tools for the performance evaluation of a multi-backend
database system, known as MDBS. The basic motivation of the
multi-backend database system (MDBS) is to develop an
architecture which spreads the work of the database system
among multiple backends. It is a major aim of this system
to allow capacity growth by the use of additional disk
drives and performance improvement by the use of additional
backends. However, to verify the design and implementation,
it is necessary to test the capability of MDBS in capacity
growth and performance gain.

Three tools for the performance and capacity tests are
investigated. The first tool is the file generation package
which creates test files for any artificial database. The
second tool is the database load subsystem which loads the
artificial database into MDBS. The third tool is the
request generation package. This package creates test
requests to query MDBS.

The following methodology is used to create an effective
tool. First, the properties of an ideal tool are described.
Then available existing programs are reviewed and evaluated 
to determine which program best meets the desired features. 
Lastly, the programs are upgraded to ensure that they are 
compatible with the current implementation, and meet the 
desired features. 

The main goal is to develop the necessary tools to
generate tests for measuring the extensibility of MDBS, i.e.,
how does MDBS perform as more backends are added?
Performance is expected to improve as the number of backends
is increased, and to be maintained as the size of the
database is increased.


I. AN INTRODUCTION


This chapter presents a brief review of the multi-backend
database system (MDBS). First, the physical arrangement of
MDBS is presented. This is followed by a presentation of
the process structure of MDBS. Lastly, the actions taken in
servicing requests, both insert and non-insert requests, are
reviewed. References are cited for the reader interested in
gaining a more detailed understanding of MDBS.


A. THE MULTI-BACKEND DATABASE SYSTEM


The multi-backend database system (MDBS) uses one
minicomputer as the master or controller, and a varying
number of minicomputers and their disks as slaves or
backends. MDBS is designed to provide database growth and
performance enhancement by the addition of identical
backends. No special hardware is required. The backends are
configured in a parallel fashion. A new backend may be added
by simply replicating the existing software on the new
backend, thus avoiding reprogramming efforts. A prototype
MDBS has been completed in order to carry out the design
verification and performance evaluation developed in
[Ref. 1] and [Ref. 2]. The implementation efforts are
described in [Ref. 3] through [Ref. 5].

The equipment configuration of the system is shown in
Figure 1.1. The host computer is connected to MDBS through
the controller. The backends are connected to the
controller through a broadcast bus. When the controller
receives a request from the host, it delivers the request
to all backends simultaneously over the broadcast bus.
Since the data is distributed across all backends, all
backends can execute a request in parallel.

Figure 1.1 HARDWARE CONFIGURATION OF MDBS.
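
The broadcast-and-execute-in-parallel arrangement can be
illustrated with a small sketch. It is not MDBS code: the
function names, the use of Python threads in place of separate
minicomputers, and the representation of a request as a simple
filter function are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def backend_execute(partition, predicate):
    """One backend scans only its own share of the records."""
    return [record for record in partition if predicate(record)]


def broadcast_request(partitions, predicate):
    """Controller: hand the same request to every backend, then merge replies.

    `partitions` is a non-empty list; each element stands for the records
    held on the disks of one backend.
    """
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        replies = pool.map(backend_execute, partitions,
                           [predicate] * len(partitions))
        # map() yields replies in backend order, so the merge is deterministic.
        return [record for reply in replies for record in reply]
```

Threads here only mirror the structure of the system (one
request, many independent scans, one merge). They do not
reproduce the timing behavior of physically separate backends.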

The division of labor between the controller and the
backends is illustrated through the process structure of
Figure 1.2. The MDBS controller handles three functions.
The request preparation function prepares a request for
transmission to the backends. The insert information
generation function processes the insert requests which
require additional information used by the backends. The
post processing function handles the work necessary when the
replies are returned to the controller from the backends but
before reaching the host.

The backends in MDBS carry out three different func- 
tions. The directory management function performs 
descriptor search, cluster search, address generation, and 
directory table maintenance. The record processing function 
performs record storage, record retrieval, record selection, 
and attribute-value extraction of the retrieved records.
The concurrency control function performs operations to 
ensure that the concurrent and interleaved execution of the 
user requests will keep the database consistent. 

Figure 1.2 PROCESS STRUCTURE OF MDBS.

Before proceeding to describe the sequence of actions
required to service a request, some terminology is
presented as a review. The smallest unit of data is a
keyword, which is an attribute-value pair. Information is
stored in terms of records, which are made up of keywords
and a record body. A predicate is of the form (attribute,
relational operator, value). A query is any Boolean
expression of predicates. Records are logically grouped into
clusters based on the attribute values and the
attribute-value ranges in the records. Internally, the
values and value ranges are called descriptors. For the
user, these attribute values are termed keywords. Each
descriptor is identified by a descriptor id to save
computing time and memory space. A prespecified set of
requests is referred to as a transaction.
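
The terminology above can be made concrete with a short
sketch. The class and function names are hypothetical, the set
of relational operators is assumed, the record body is omitted,
and the query is restricted to disjunctive normal form (an OR
of AND-ed predicates) purely to keep the example small.

```python
from dataclasses import dataclass
import operator

# Assumed set of relational operators for illustration.
OPERATORS = {"=": operator.eq, "!=": operator.ne, "<": operator.lt,
             "<=": operator.le, ">": operator.gt, ">=": operator.ge}


@dataclass(frozen=True)
class Predicate:
    """A predicate of the form (attribute, relational operator, value)."""
    attribute: str
    op: str
    value: object

    def matches(self, record: dict) -> bool:
        # A record is modeled as a dict of keywords (attribute -> value).
        # A record lacking the attribute cannot satisfy the predicate.
        if self.attribute not in record:
            return False
        return OPERATORS[self.op](record[self.attribute], self.value)


def satisfies(record: dict, query: list) -> bool:
    """Query as a list of conjunctions; each conjunction is a list of predicates."""
    return any(all(p.matches(record) for p in conj) for conj in query)
```

For example, a record with keywords FILE = Census and
POPULATION = 120000 satisfies the query
(FILE = Census) and (POPULATION > 100000).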


B. REQUEST EXECUTION


This section describes the sequence of actions taken by
MDBS in carrying out a request. First, the insert request
will be discussed. Then the non-insert requests will be
described. Non-insert requests are requests for deletion,
retrieval, or update.


1. Actions for Insert Requests


The sequence of actions for an insert request is
shown in Figure 1.3. A request from the host machine enters
the Request Preparation process. Request Preparation
broadcasts the number of requests in the transaction to Post
Processing in order to determine when a transaction is
completed. Request Preparation may send an error to Post
Processing if there is a syntax error in the request. When
a transaction is completed, Post Processing sends the results
to the host machine. Request Preparation then broadcasts
the request to Directory Management. Each backend finds the
descriptor ids associated with the request. The backends
then exchange descriptor id information.

After receiving the descriptor ids from the other
backends, Directory Management sends the cluster id to
Insert Information Generation. Insert Information
Generation then determines which backend is to store the
record. The selected backend determines the address of the
new record and stores it. The other backends discard the
record. Finally, Record Processing sends an action-completed
message to Post Processing, which in turn informs
the host.

Figure 1.3 SEQUENCE OF ACTIONS FOR AN INSERT REQUEST.


2. Actions for Non-Insert Requests


The sequence of actions for a non-insert request is
shown in Figure 1.4. Only the actions for a retrieve will be
discussed, since the other types of requests are quite
similar. A request from the host machine enters the Request
Preparation process. Request Preparation sends the number
of requests in the transaction to Post Processing in order
to determine when a transaction is completed. Request
Preparation may send an error to Post Processing if there is
a syntax error in the request. When a transaction is
completed, Post Processing sends the results to the host
machine. Request Preparation then broadcasts the request to
Directory Management. Each backend finds the descriptor ids
associated with the request. The backends then exchange
descriptor id information.

After receiving the descriptor ids from the other
backends, Directory Management determines the cluster ids.
Lastly, Directory Management determines the addresses of the
records of the identified clusters. Record Processing gets
the records from secondary storage and extracts the
necessary information. If aggregate operators, for example,
the average, are specified in the retrieve request, they are
applied at this time. The partially aggregated values are
sent to Post Processing. Post Processing sends the results
to the host following any further aggregate operations.
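
The two-stage handling of an aggregate such as the average can
be sketched as follows. The (sum, count) form of the partial
result and both function names are assumptions for
illustration; the text does not specify the format of the
partially aggregated values.

```python
def backend_partial_avg(values):
    """Record Processing at one backend: reduce local values to (sum, count)."""
    return (sum(values), len(values))


def post_process_avg(partials):
    """Post Processing: combine the partial results from all backends."""
    total = sum(s for s, _ in partials)
    count = sum(c for _, c in partials)
    # No qualifying records on any backend: the average is undefined.
    return total / count if count else None
```

With local values [10, 20], [30], and [] on three backends, the
partial results are (30, 2), (30, 1), and (0, 0), and the
combined average is 60 / 3 = 20. Averaging the two non-empty
local averages (15 and 30) would give 22.5, which is why a
further aggregate step at the controller is needed.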

This concludes the review of MDBS. Attention is now
turned toward performance issues of this system in the
following chapter.

Figure 1.4 SEQUENCE OF ACTIONS FOR A NON-INSERT REQUEST.


II. PERFORMANCE EVALUATION


A. TWO VIEWS OF PERFORMANCE MEASUREMENT 


Now that the MDBS has been described, it is reasonable
to ask "how does one determine the performance of such a
system?" There are two viewpoints of performance evaluation.
The first is the macroscopic viewpoint in which the
key performance measurement is the relative response time.
The second viewpoint is the microscopic viewpoint. This
viewpoint is concerned with measuring the times needed to
perform various subtasks which are carried out in servicing
a request. In [Ref. 6], the motivation for the macroscopic
measurement is provided. This chapter is concerned with
describing the performance issues which arise when using the
macroscopic viewpoint. Thus in testing the MDBS, the
macroscopic viewpoint will be used before proceeding to the
microscopic viewpoint.


B. CRITERIA FOR PERFORMANCE EVALUATION AND TOOL SELECTION 


As stated above, with the macroscopic viewpoint the
key performance measurement is the relative response time.
That is, the concern lies mainly with the effect of various
changes to the system on the response time. These changes,
and therefore their relative response times, are prompted by
the variables described in the following section.


1. Performance Issues


The macroscopic viewpoint is concerned with changing
four categories of variables and observing their effect on
the relative response time. These variables include system
configuration variables, cluster formation variables,
request construction variables, and storage variables.

The system configuration variables deal with the
following questions on how MDBS performs when: the number
of backends remains constant but the database increases, the
database remains constant and the number of backends
increases, the number of concurrent users increases, the
number of requests per transaction increases, and the
presence of concurrency control is measured against the
absence of concurrency control.

The cluster formation variables deal with the
following questions on how MDBS performs when: the number
of descriptors on any attribute increases, the average size
of clusters in the database ranges over small, medium, and
large size, and the number of attributes and thus the size
of the attribute table increases.

The request construction variables deal with the
following questions on how MDBS performs when: the request
makeup is retrieve-intensive vs. update-intensive, the
complexity of the query increases, the relative mix of query
types is varied, the retrieved information is either a
projection of the record or the whole record, the query
predicates are permuted, and the request uses either
non-directory keywords or directory keywords.

Lastly, the storage variables deal with the
following questions on how MDBS performs when: the data
placement strategy of the database changes, the tuple width
increases, and the size of the retrieved information exceeds
that available in the main memory.

Thus it can be seen that several variables influence
the performance of MDBS. This is not an all-inclusive list.
However, the list will serve as a basis for developing the
desired properties of each performance tool. Each tool will
be discussed along with its desired properties in the
following sections.


C. DESIRABLE PROPERTIES OF THE TEST FILE GENERATION PACKAGE 


The purpose of the file generation package is to create
an artificial database which will eventually be loaded into
MDBS. This is the first tool to be used for the evaluation.
Several parameters are likely to be varied in the light of
the performance issues. The desired properties of the
package are as follows. The input parameters to such a
package may include: file size in number of records per file,
attribute-value size in bytes of storage, record size in
number of attribute values, data types of attribute values,
and database size in number of files per database. In
addition, parameters must indicate whether values of
attributes are taken from random functions or from
predetermined sets, and whether uniqueness of values is
desired.
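
A minimal sketch of a generator driven by such parameters is
given below. It is not the package described in this thesis:
the dictionary keys (name, type, size, value_set, unique), the
function names, and the treatment of size as a digit or
character count rather than bytes are all assumptions made for
illustration.

```python
import random
import string


def random_value(data_type, size, rng):
    """Draw one uniformly distributed value of the requested type."""
    if data_type == "int":
        return rng.randrange(10 ** size)
    if data_type == "float":
        return round(rng.uniform(0, 10 ** size), 2)
    return "".join(rng.choice(string.ascii_uppercase) for _ in range(size))


def generate_file(num_records, attributes, seed=0, max_tries=1000):
    """Build one test file as a list of records (dicts of attribute -> value)."""
    rng = random.Random(seed)  # seeded so a test file can be regenerated
    seen = {a["name"]: set() for a in attributes if a.get("unique")}
    records = []
    for _ in range(num_records):
        record = {}
        for a in attributes:
            for _attempt in range(max_tries):
                if "value_set" in a:      # predetermined set of values
                    value = rng.choice(a["value_set"])
                else:                     # random function
                    value = random_value(a["type"], a["size"], rng)
                if not a.get("unique") or value not in seen[a["name"]]:
                    break
            else:
                raise ValueError(f"cannot find a unique value for {a['name']}")
            if a.get("unique"):
                seen[a["name"]].add(value)
            record[a["name"]] = value
        records.append(record)
    return records
```

A database of several files would then be a list of such
calls, one per file, each with its own record count.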


D. DESIRABLE PROPERTIES OF THE DATABASE LOAD SUBSYSTEM 


The database load subsystem is responsible for taking 
the files created by the file generation package and for 
properly loading the files into MDBS. In the process of 
loading the database, the database load subsystem must also
create the necessary tables used in directory management. 

The database load subsystem must be designed so that the
performance evaluation may utilize various cluster formation
variables and storage variables with minimum effort. The
cluster formation variables and storage variables with which
the performance evaluation may be concerned include the
following. The performance may be expected to depend upon
whether the number of descriptors (attributes) is large or
small. Certainly, when entering a large number of
descriptors (attributes), the chance for error in this menial
task is great. Therefore, the ease of specifying the
descriptors (attributes) must be guaranteed. The variation of
cluster size may affect performance. The cluster size is a
function of the number of descriptors, the size of the input
files, and the values used in the attribute fields.
Therefore, these three parameters should be entered
independently. The data placement strategy, i.e., how records
are distributed across the backends, also affects
performance. While simulation studies described in [Ref. 1]
and [Ref. 2] show that the
track-splitting-with-random-placement strategy is the most
desirable, the ability to change the placement strategy will
provide a means of confirming these studies.
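
The dependence of cluster size on the descriptors, the input
files, and the attribute values can be seen in a small sketch.
It assumes that records which map to the same set of
descriptor ids fall into the same cluster, and that each
descriptor is an inclusive value range; the function names and
the table layout are hypothetical.

```python
from collections import defaultdict


def descriptor_id(descriptors, attribute, value):
    """descriptors: {attribute: [(low, high, id), ...]} with inclusive ranges."""
    for low, high, did in descriptors.get(attribute, []):
        if low <= value <= high:
            return did
    return None  # the value is covered by no descriptor


def form_clusters(records, descriptors):
    """Group records by the set of descriptor ids their keywords map to."""
    clusters = defaultdict(list)
    for record in records:
        key = tuple(sorted(
            did for attr, val in record.items()
            if (did := descriptor_id(descriptors, attr, val)) is not None))
        clusters[key].append(record)
    return dict(clusters)
```

Splitting one range into two descriptors, with the same input
file and values, yields more and therefore smaller clusters,
which is why the three parameters are best varied one at a
time.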


E. DESIRABLE PROPERTIES OF THE REQUEST GENERATION PACKAGE


The request generation package is concerned with
creating and executing test requests. The request formation
variables will be altered by the performance evaluation team
in this performance evaluation tool. The request formation
variables will be changed in order to vary the following:
the percentage of the types of requests (retrieve, update,
insert, or delete), the percentage of aggregate operators
(ave, max, min, sum, and count) in retrieve requests, the
complexity of the request query (a simple query will consist
of one to two predicates, and a complex query will consist
of ten to fifteen predicates), the order of the predicates
appearing in the request, and the number of attributes to be
projected in the retrieve request.

The request generation package must also possess the
ability to do the following: vary the length of the
transaction to determine its effect on system performance,
tag requests with user identification in order to test
concurrency control, retrieve a record defined over the
null descriptor, execute a retrieve request where the entire
cluster is stored at one backend, and compare the above
performance with a retrieve request where the cluster is
distributed across all backends.
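
Two of these variables, the request-type mix and the query
complexity, can be sketched as follows. The function name, the
request representation, the three relational operators, and
the value range are assumptions for illustration; only the
predicate counts (one to two for a simple query, ten to
fifteen for a complex query) come from the text.

```python
import random


def generate_requests(count, mix, attributes, complex_share=0.5, seed=0):
    """mix: {request type: relative weight}; returns a list of request dicts."""
    rng = random.Random(seed)
    kinds = list(mix)
    weights = [mix[kind] for kind in kinds]
    requests = []
    for _ in range(count):
        kind = rng.choices(kinds, weights=weights)[0]
        if kind == "insert":
            # An insert carries a record rather than a query.
            record = {attr: rng.randrange(1000) for attr in attributes}
            requests.append({"type": kind, "record": record})
            continue
        # Complex queries use 10 to 15 predicates, simple ones 1 to 2.
        low, high = (10, 15) if rng.random() < complex_share else (1, 2)
        query = [(rng.choice(attributes), rng.choice(["=", "<", ">"]),
                  rng.randrange(1000))
                 for _ in range(rng.randint(low, high))]
        requests.append({"type": kind, "query": query})
    return requests
```

A retrieve-intensive workload is then a mix such as
{"retrieve": 70, "update": 10, "insert": 10, "delete": 10}.
Permuting the predicates of a generated query, or grouping
requests into transactions of varying length, can be layered
on top of this list.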

It is now appropriate to proceed to the details of each
of the above three tools. In the following chapter the test
file generation package is discussed. Chapter IV deals with
the details of the database load subsystem, and Chapter V
develops the test request generation package.


III. THE TEST FILE GENERATION PACKAGE

In this chapter, we discuss the test file generation
package development. In the first two sections, we review
the purpose and desired properties of the package. In the
next two sections, we discuss how the basic program was
selected from existing file generation tools. Finally, in
the last two sections we discuss the upgrading of the
selected program and future enhancements which will further
aid the performance evaluation team.


A. THE PURPOSE 


The first set of performance evaluation experiments will
use test data which is generated by a program in the form
specified by the experimenter. This process may be viewed
in three steps. The first step consists of defining the
structure of the files to be generated. The second step
determines where the values for the specified attributes
will be generated. Lastly, the files are generated and
stored for future use.


B. DESIRED PROPERTIES


The input parameters to such a package may include:
file size in number of records per file, attribute size in
bytes of storage, record size in number of attribute values,
data types of attributes, database size in number of files
per database, whether values of attributes are taken from
random functions or are selected from predetermined sets,
and whether uniqueness of values is desired.


C. EXISTING PROGRAMS


Two programs were reviewed in order to determine which
possesses the largest number of desired properties and still
would require the least effort to ensure system
compatibility with the current version of MDBS. The first of
the two programs was originally designed in [Ref. 3]. The
second was a later attempt to simplify the test file
generation package.


1. The Original Test File Generation Package 


In this program the test data is generated and
stored in files. Several characteristics of the file are
specified by the experimenter. Each file is given a name.
The data in the records is specified in a fixed number of
attribute-value pairs. The type of data in the attributes
may be integer, string, or floating-point numbers. These
values are either taken from predetermined files, called
sets, created by the experimenter, or randomly generated
by separate functions. Only a uniform distribution of the
various data types is available. This program contains all
of the desired properties stated above, except the ability
to guarantee uniqueness of the records created.


2. The Shortened Test File Generation Package


This program was written in order to reduce the
complexity of the original test file generation package.
Many of the features of the original program remain intact.
Two important differences exist. The shortened version
allows only predetermined sets of values to be used,
and therefore does not allow randomly generated values. The
second difference is that the files generated must
be no longer than 10,000 records. An advantage of the
shortened version is that it is combined with the shortened
database load program, which is discussed in the following
section.


D. SELECTION OF THE TEST FILE GENERATION PACKAGE 


The shortened version of the test file generation
package was selected initially as the file generation tool.
MDBS is currently undergoing a change in the version of the
compiler used. In an attempt to keep the conversion of MDBS
simple, the shortened version was chosen. This version
allowed a rapid conversion. However, only user-defined sets
of values are selected for the attribute values. This is
considered a disadvantage. Perhaps the overriding
consideration in the selection of the shortened version was
the fact that its associated database load subsystem was much
simpler. The discussion of this subsystem is provided in
detail in the following section.


E. THE UPGRADING PROCESS


The upgrading process for the shortened version of the
test file generation package was relatively simple. The
compiler originally used in the implementation was an older
version. The new version is being used by MDBS. Several
minor compiler differences with respect to acceptable syntax
were rapidly fixed.


F. FUTURE IMPROVEMENTS


Because the shortened version possesses all but one of
the desired properties discussed in Chapter II, only one
future change is anticipated.

Two approaches exist for providing the shortened version
with the capability of randomly generating values. The
first alternative is to add the random-value functions to
the program, together with the additional user interface
needed to select them as options. The second alternative is
to adapt the original test file generation package to be
compatible with the shortened database load. The task would
be simplified by choosing the first alternative.

This concludes the discussion of the test file generation
tool. In the following chapter, we discuss the properties
of the selected database load subsystem.


IV. THE DATABASE LOAD SUBSYSTEM

In this chapter, we discuss the database load subsystem
development. In the first two sections, we review the
purpose and desired properties of the subsystem. In the
next two sections, we discuss how the basic program was
selected from existing database load tools. Finally, in the
last two sections, we discuss the upgrading of the selected
program and future enhancements which will further aid the
performance evaluation team.


A. THE PURPOSE 


The database load subsystem is a software tool used to
designate an input source file and to create a database from
that source file. It also allows several related files to
be consolidated into one database if desired. The first
phase in the database load subsystem is to define the input
files and the database. The second phase consists of
constructing various directory management tables. Lastly,
the records are distributed across the backends.


B. DESIRED PROPERTIES


The database load subsystem must be designed so that the
performance evaluation may utilize various cluster formation
variables and storage variables with minimum effort. The
performance may be expected to depend upon whether the
number of descriptors (attributes) is large or small. The
ease of specifying the descriptors (attributes) must be
guaranteed. The variation of cluster size may affect
performance. The cluster size is a function of the number
of descriptors, the size of the input files, and the values
used in the attribute fields. These three parameters should
be entered independently. The data placement strategy,
i.e., how records are distributed across the backends, also
affects performance. The ability to change the placement
strategy will provide a means of confirming simulation
studies.


C. EXISTING PROGRAMS 


Two database load subsystems were reviewed. In this
section the merits of both of the existing programs are
discussed. The original database load subsystem is covered
first, then a shortened version of the database load
subsystem is evaluated.


1. The Original Database Load Subsystem


The original database load subsystem was first
designed at the beginning of the implementation stage of
MDBS. The process is viewed as four logical phases. The
first phase is the database definition phase, in which the
user specifies various characteristics of existing source
files and the characteristics of the database to be created.
The second phase is the record preparation phase, in which
the data is read from the input files and prepared for
loading. The third phase is the record clustering phase, in
which the prepared records are sorted into clusters. The
last phase is the record and table distribution phase. This
phase distributes the records and the directory management
tables to the backends.


2. The Shortened Database Load Subsystem


As stated in Chapter III, the shortened database load
subsystem is much simpler than the original database load
subsystem. This implementation can be viewed as two phases.
The first phase is the directory table construction phase,
in which specified database parameters are read from
existing files and the directory tables are constructed.
The second phase is the record distribution phase. In this
phase the records are distributed to the backends by using
insert requests. Thus this subsystem uses currently
existing directory management functions to load the database
records.


D. THE SELECTION OF THE DATABASE LOAD SUBSYSTEM


Several disadvantages to the original database load
program exist. Since it was created at the inception of
MDBS design, it possessed many system incompatibilities with
the current version of MDBS. Once again, the large size of
the program posed a significant maintenance problem with
respect to the conversion of the system to the new compiler.
For these reasons this program was not selected.

The shortened version of the database load subsystem was 
chosen as the basis for the database load tool. This was 
due to the fact that it used existing directory management 
code and that it was much simpler to understand and thus 
maintain. 


E. THE UPGRADING PROCESS


In this section, we discuss the upgrading of the
shortened version of the database load subsystem. A
discussion of the communication among processes is
presented. Then the changes to the database load subsystem
are discussed.


1. Message Passing


In order to load the current version of MDBS, it is
necessary to change the database load subsystem so that it
can communicate with the backend process of directory
management. The database load subsystem is implemented as a
separate process in the controller. A brief discussion of
message passing in MDBS is presented below.


a. Message Passing Within a Backend


The backends are supported by PDP-11/44s running
under the RSX-11M operating system. The inter-process
communication facility is the shared access to physical
memory. Suppose process X wants to send a message to
process Y. X will copy the message into the shared area.
Then X tells the operating system to send the address of the
message to process Y. When Y is ready to receive a message,
it gets the address of the message from the operating
system's queue of such addresses. Process Y then copies the
message into its own memory space.


b. Message Passing Within the Controller 


The MDBS controller is a VAX-11/780 using the VMS
operating system. The inter-process communication facility
is the mailbox. The mailbox is a software input/output
device. If process X wishes to send process Y a message,
process X first issues a send command to process Y's
mailbox. When process Y issues the read command on its
mailbox, it will be given the message sent by process X. The
mailbox can queue several messages.
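
The send and read behavior of a mailbox can be mimicked with a
first-in, first-out queue. The sketch below is a Python
stand-in with hypothetical names; it does not use or reproduce
the VMS mailbox services.

```python
import queue


class Mailbox:
    """A software message queue owned by one receiving process."""

    def __init__(self):
        self._messages = queue.Queue()

    def send(self, message):
        """Called by the sending process X; never waits for the reader."""
        self._messages.put(message)

    def read(self):
        """Called by the owning process Y; blocks until a message is present."""
        return self._messages.get()

    def pending(self):
        """Number of messages queued and not yet read."""
        return self._messages.qsize()
```

Messages are read in the order in which they were sent, so a
sender may issue several messages before the receiver reads
any of them.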
c. Message Passing Between Computers


Communication between computers in MDBS is achieved by
using a time-division-multiplexed bus called the parallel
communication link (PCL). Two interface processes to the
PCL are used in each computer. The first process, called
put_PCL, puts the message to be sent to the other computers
on the PCL. The second process, called get_PCL, receives
the message from the bus and then passes the message to the
appropriate process. PCLs are presently used to simulate
the broadcast bus and will be replaced physically by a
broadcasting bus later.


Several directory tables exist in order to process 
requests. In this section the logical descriptions of such 
tables are discussed. This will allow some insight into 
what kind of messages must be sent during the loading of the 
database. 

The Attribute Table (AT) contains a list of the
directory attributes and a pointer to the descriptors
defined on these attributes. The AT is located at each
backend. The Descriptor-to-Descriptor-Id Table (DDIT)
contains the descriptors and their corresponding descriptor
ids. Each section of the DDIT is associated with a
directory attribute and contains the descriptors defined on
that attribute. The DDIT is located at each backend. Since
type-C sub-descriptors are created dynamically as new
records are inserted, the type-C attributes must be recorded
in a table called the Type-C-Descriptor-Table (TCDT). The
TCDT is located in the controller. When an insert request
contains a record with a type-C attribute and the value of
the attribute does not appear in a type-C descriptor, a new
type-C descriptor will be created by the Insert Information
Generation process. This process will then record the
descriptor in the TCDT. Thus all directory attributes and
their corresponding descriptors are sent to the backends'
Directory Management processes. All type-C attributes are
also sent to the Insert Information Generation process in
the controller.
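
The logical relationship among the three tables can be
sketched as follows. The class names, the use of dictionaries
for the AT and DDIT, and the sequential numbering of
descriptor ids are assumptions for illustration and are not
taken from the MDBS implementation.

```python
class BackendDirectory:
    """AT and DDIT as held by one backend."""

    def __init__(self):
        # AT: directory attribute -> that attribute's section of the DDIT.
        # DDIT section: descriptor -> descriptor id.
        self.at = {}
        self._next_id = 1

    def add_attribute(self, attribute):
        self.at.setdefault(attribute, {})

    def add_descriptor(self, attribute, descriptor):
        section = self.at[attribute]
        if descriptor not in section:
            section[descriptor] = self._next_id
            self._next_id += 1
        return section[descriptor]


class TypeCDescriptorTable:
    """TCDT as held by Insert Information Generation in the controller."""

    def __init__(self, type_c_attributes):
        self.descriptors = {attr: set() for attr in type_c_attributes}

    def note_insert(self, record):
        """Create a type-C descriptor for each type-C value not seen before."""
        created = []
        for attr, value in record.items():
            if attr in self.descriptors and value not in self.descriptors[attr]:
                self.descriptors[attr].add(value)
                created.append((attr, value))
        return created
```

Loading a database therefore requires sending attributes and
descriptors to every backend, and the type-C attributes to the
controller process that owns the TCDT.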


3. Specific Upgrades 


The database load subsystem program was changed by
allowing it to communicate with the backends in order to
load the database to the backends. In order to distribute
the directory management tables to all backends, the
database load subsystem must be given its own mailbox and
access to the directory management physical areas located in
the backends. All of the functions which create the directory
management tables were moved to the backends and
appropriately placed in the directory management processes.
Data necessary to construct these tables was passed to the
backends by using messages containing codes which indicate
the type of action to be taken. Because the backends can
construct the tables in parallel, this did not significantly
burden the database load process. In order to support the
message passing ability, send and receive routines specific
to the database load process were written. Figure 4.1
illustrates the inter-process communication involved with
the directory table construction phase.

Figure 4.1 COMMUNICATIONS: DIRECTORY TABLE CONSTRUCTION.

In order to load the records into the database,
communication between the request preparation process
(located in the controller) and the database load subsystem
was established. This allowed the database load subsystem
to send the insert requests directly to request preparation.
Thus the database load subsystem was given access to the
request preparation mailbox. It was also necessary to send
the Insert Information Generation process all of the type-C
attributes for insertion into the TCDT. Figure 4.2 shows
the inter-process communication of the record distribution
phase.

The following is a summary of the types of messages
which were added to the database load subsystem:


Message type:  (1) Create AT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message creates an AT for
               the given database name.


Message type:  (2) Add Attribute to AT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds an attribute
               to the AT for the given database.


Message type:  (3) Add Descriptor to DDIT
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds a descriptor
               to the DDIT for the given database.


Message type:  (4) Add the end of descriptor flag
Source:        Database Load (DBL)
Destination:   Directory Management
Explanation:   This message adds the flag to signal
               the end of the descriptor list.


Message type:  (5) Load type-C
Source:        Database Load (DBL)
Destination:   Insert Information Generation
Explanation:   This message passes the type-C attribute
               to IIG for entry into the TCDT.


Message type:  (6) Insert record
Source:        Database Load (DBL)
Destination:   Request Preparation
Explanation:   This message sends the record to be
               loaded to RP.


Message type:  (7) Responses
Source:        Directory Management and
               Insert Information Generation
Destination:   Database Load
Explanation:   This group of messages informs DBL of
               action that is actually carried out as
               requested by the above messages from
               DBL. They also include error messages.


Thus for each of the messages (1) through (6), a type (7)
message is sent to the database load subsystem. This
concludes the upgrading of the database load subsystem.
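
The message summary above can be pictured as a small routing table. The following is only a sketch: the enum names, the numbering, and both helper functions are hypothetical and are not taken from the MDBS source. It records which process receives each message type and which types are answered by a type (7) response.

```c
#include <assert.h>

/* Hypothetical codes mirroring message types (1) through (7). */
enum dbl_msg_type {
    MSG_CREATE_AT = 1,
    MSG_ADD_ATTR_TO_AT,
    MSG_ADD_DESC_TO_DDIT,
    MSG_END_OF_DESC,
    MSG_LOAD_TYPE_C,
    MSG_INSERT_RECORD,
    MSG_RESPONSE
};

/* Hypothetical identifiers for the processes named in the summary. */
enum mdbs_process {
    PROC_DATABASE_LOAD,
    PROC_DIRECTORY_MANAGEMENT,
    PROC_INSERT_INFO_GENERATION,
    PROC_REQUEST_PREPARATION
};

/* Return the process that receives a message of the given type. */
enum mdbs_process dbl_msg_destination(enum dbl_msg_type type)
{
    assert(type >= MSG_CREATE_AT && type <= MSG_RESPONSE);
    switch (type) {
    case MSG_CREATE_AT:
    case MSG_ADD_ATTR_TO_AT:
    case MSG_ADD_DESC_TO_DDIT:
    case MSG_END_OF_DESC:
        return PROC_DIRECTORY_MANAGEMENT;
    case MSG_LOAD_TYPE_C:
        return PROC_INSERT_INFO_GENERATION;
    case MSG_INSERT_RECORD:
        return PROC_REQUEST_PREPARATION;
    case MSG_RESPONSE:
        break;
    }
    return PROC_DATABASE_LOAD;   /* responses go back to DBL */
}

/* Types (1)-(6) originate in DBL and are each answered by a type (7). */
int dbl_msg_expects_response(enum dbl_msg_type type)
{
    return type >= MSG_CREATE_AT && type <= MSG_INSERT_RECORD;
}
```

Keeping the routing in one function means a new message type needs only one new case.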


F. FUTURE IMPROVEMENTS 


The database load subsystem contains all of the desired
properties discussed above with the exception of the ability
to change the data placement strategy. Due to the manner in
which the database is loaded, this would require a change in
the directory management process. Further research is
required to investigate the ramifications of changing the
directory management process. This feature should be
delayed until the system conversion to the new compiler is
completed.
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In this chapter, we discuss the test request generation 
package development. In the first two sections, we review 
the purpose and desired properties of the package. In the 
next two sections, we discuss how the basic program was 
selected from existing request generation tools. Finally, 
in the last two sections, we discuss the upgrading of the 
selected program and future enhancements which will further 
aid the performance evaluation team.


A. THE PURPOSE 


The purpose of the test request generation package is to 
provide an easy means of creating a list of test requests 
which will be executed in order to test MDBS. The package 
also aids the evaluation team in executing the list of
requests. The list of requests is saved in a file for
future use, in order to avoid regenerating the list of
requests.


B. DESIRED PROPERTIES


Recall that the test request generation package permits
the request formation variables to be altered by the
evaluation team. This allows the following to be varied: the
percentage of the types of requests (retrieve, update,
insert, or delete), the percentage of aggregate operators
(avg, max, min, sum, and count) in retrieve requests, the
complexity of the request query, the order of the predicates
appearing in the request, and the number of attributes to be
projected in the retrieve request.

The request generation package must also possess the
ability to allow the following modifications: vary the
length of the transaction, tag requests with user
identification, retrieve a record defined over the null
descriptor, and execute a retrieve request in which the entire
cluster is stored at one backend and compare the performance
with a request which retrieves records from a cluster which is
stored across all backends.


C. EXISTING PROGRAMS


Two existing programs were reviewed in order to select
the one which best fits the desired properties and is
compatible with the current version of MDBS. Both programs
implement the test request generation package in the
controller. The next section discusses Version A of the
test request generation package. Version A was originally
designed at the commencement of the implementation of MDBS.
Version B was a later version.


1. Version A 


Version A may be described as a package which aids
the user in developing a list of requests. The user is
guided through the construction of one request at a time.
The program ensures that the syntax is correct. The intent
of this method is to generate a small number of requests
which are thoughtfully devised in order to test specific
features of MDBS. This program also assumes that one user
will execute only one request at a time. The user is
allowed the following options when using this test request
generation package: generating a list of requests for later
use, retrieving a list of requests to be executed in any
order, modifying an existing list, or executing a list of
requests.


2. Version B


Version B is a follow-on package to Version A. It
therefore possesses all of the features contained in Version
A. It should be noted that Version B adds the ability to
use the concept of transactions. Recall that a transaction
is a group of one or more requests. Thus the requirement of
executing only one request at a time is removed.


D. THE SELECTION OF THE TEST REQUEST GENERATION PACKAGE


Because Version B contains all the features of Version 
A, Version B was selected as the test request generation 
package. Because this version arrived at the current
implementation site of MDBS rather late in the review of
performance evaluation tools, many of the desired features must
be left for future development. This does not detract from the
usefulness of the test request generation package as it
stands.


E. THE UPGRADING PROCESS


The majority of the upgrading accomplished on the test 
request generation package consisted of ensuring that the 
syntax discrepancies due to compiler differences were 
removed. A reorganization of the file location of MDBS 
resulted in many changes to the programs.


F. FUTURE IMPROVEMENTS


Several enhancements to the request generation package
may be desirable. Three major enhancements include the
following: program generation of requests, simulation of
multiple concurrent users, and development of a storage
information package to aid in request selection.
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In order to test MDBS, the test request generation 
package could be modified to contain a routine which
generates random requests. The input to such a routine would
include parameters such as the percentage of each type of
request to be generated and the query complexity. Query
complexity involves changing the number of predicates in the
requests. This ability would allow the evaluation team to
easily determine which type of request is most efficient
under MDBS.
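
A routine of this kind could map a random draw onto a request type by walking the cumulative percentages. The following is a minimal sketch under stated assumptions: the names `request_type`, `pick_request_type`, and `SAMPLE_MIX`, and the sample 50/20/20/10 mix, are illustrative only and do not come from the existing package.

```c
#include <assert.h>

/* The four request types named in the desired properties. */
enum request_type {
    REQ_RETRIEVE,
    REQ_UPDATE,
    REQ_INSERT,
    REQ_DELETE,
    REQ_TYPE_COUNT
};

/* Hypothetical mix: 50% retrieve, 20% update, 20% insert, 10% delete. */
static const int SAMPLE_MIX[REQ_TYPE_COUNT] = { 50, 20, 20, 10 };

/* Map a draw in 0..99 onto a request type using cumulative
 * percentages. The entries of pct[] must sum to 100. */
enum request_type pick_request_type(const int pct[REQ_TYPE_COUNT], int draw)
{
    int sum = 0, cumulative = 0, t;

    for (t = 0; t < REQ_TYPE_COUNT; t++)
        sum += pct[t];
    assert(sum == 100);
    assert(draw >= 0 && draw < 100);

    for (t = 0; t < REQ_TYPE_COUNT; t++) {
        cumulative += pct[t];
        if (draw < cumulative)
            return (enum request_type)t;
    }
    return REQ_DELETE;   /* not reached when pct[] sums to 100 */
}
```

Feeding the function draws from a seeded generator would make a generated request list reproducible from one run to the next.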


In order to evaluate the effect of concurrency 
control, MDBS must be tested while several users are using 
the system. By providing a way to link a user to the 
requests which are generated, the test request generation 
package would simulate multiple users. This would avoid
processing several separate files of requests. This would
also result in repeatable experiments, in that the
conditions resulting from executing the concurrent user requests
could be duplicated.
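
One way to obtain a repeatable multi-user run from a single list is to tag each request with a user number and interleave the users in a fixed order. The sketch below assumes a simple round-robin schedule; the structure, the function name, and the `MAX_USERS` limit are hypothetical and are not part of the existing package.

```c
#include <assert.h>

#define MAX_USERS 8   /* hypothetical upper limit on simulated users */

struct tagged_request {
    int user_id;      /* which simulated user issues the request */
    int request_no;   /* position within that user's own list */
};

/* Interleave the users' request lists round-robin into out[].
 * counts[u] is the number of requests belonging to user u.
 * Returns the number of entries written. The order depends only on
 * the inputs, so the same inputs always give the same schedule. */
int interleave_requests(const int counts[], int n_users,
                        struct tagged_request out[], int out_cap)
{
    int next[MAX_USERS] = { 0 };
    int written = 0, progressed = 1, u;

    assert(n_users > 0 && n_users <= MAX_USERS);
    while (progressed) {
        progressed = 0;
        for (u = 0; u < n_users; u++) {
            if (next[u] < counts[u] && written < out_cap) {
                out[written].user_id = u;
                out[written].request_no = next[u]++;
                written++;
                progressed = 1;
            }
        }
    }
    return written;
}
```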


3. The Storage Information Package


The storage information package would allow the 
experimenter to ask specific questions about the database 
storage information so that intelligent queries can be
derived. The questions an experimenter might ask would
include: What descriptors are associated with a certain
attribute? What descriptor ids define a certain cluster
number? Where is cluster one stored?

This package could be implemented by sending
messages to the backends. Each message would be associated
with a routine which walks through the directory management
tables and finds the appropriate information and sends it
back to the controller. By evaluating the responses to the
messages, more meaningful requests can be constructed in
order to evaluate specific features of MDBS.
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In Chapter I, we discussed the study phase of creating 
the tools. In Chapter II, we discussed the design phase. 
The development phase was outlined in Chapters III, IV, and
V. In this chapter, we discuss the operational phase. This
taxonomy of phases is outlined in detail in [Ref. 8]. More
specifically, in this chapter, we discuss the performance
evaluation tools with respect to several software
engineering principles.


A. BASIS OF ANALYSIS


In this section, we discuss the standards by which the 
evaluation tools are to be analyzed. The two major
categories of the analysis are the ability to meet the objectives
stated in the design phase and the ability to meet software
goals. The standards are described in detail in [Ref. 9]
and [Ref. 10].

The ability to meet objectives means that the tool
possesses the capabilities outlined in the design phase. These
capabilities were discussed in detail in Chapter II.

The performance evaluation tools will be evaluated also 
by their ability to meet five software goals. The first 
goal is that of modifiability. Modifiability includes the
properties of extensibility, consistency, maintainability,
and modularization. The second goal is that of reliability.
Reliability includes the properties of possessing no blatant
errors and of possessing error recoverability. The third
goal is simplicity. This includes ease of use and
singleness of purpose. Efficiency is the fourth goal. A tool
will possess this goal if it contains no gross inefficiency.
The last software goal is that of understandability.
Understandability means that the tool utilizes abstractions,
modularity, and information hiding, and is supported with
reasonable documentation.


B. ANALYSIS OF THE FILE GENERATION PACKAGE 


The objectives of the file generation package were
discussed in Chapter II. The objective that was not met by
this tool is the ability to indicate whether values of the
attributes are taken from random functions or predetermined
sets of values. The random functions must be added at a
future date.

The file generation package meets all goals with the 
exception of efficiency. Modifiability is achieved through 
the extensive use of modularization with respect to grouping 
like operations together. Reliability has been observed in 
that no errors have existed since the operational phase. 
Simplicity is demonstrated by using menu-driven operations
in the file generation package. Lastly, understandability
is achieved by religious use of abstraction of data and
operation. The gross inefficiency in the package results
from the use of a large array which is used to store the
unique records which are generated. When a large number of
records are to be inserted at one time, the time to compare
the new record against all previously generated records is
great. This concludes the evaluation of the test file
generation package.
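
The cost described above comes from comparing each new record with every earlier one, which grows quadratically with the number of records. One common remedy, offered here only as a possible direction and not as part of the existing package, is to keep the generated records in a hash table so that a duplicate check touches only a few entries. In this sketch the table size, the record length, and all names are hypothetical.

```c
#include <assert.h>
#include <string.h>

#define TABLE_SIZE     1024   /* hypothetical; a power of two */
#define MAX_RECORD_LEN 80     /* hypothetical record length */

struct record_set {
    char slots[TABLE_SIZE][MAX_RECORD_LEN + 1];
    unsigned char used[TABLE_SIZE];
    int count;
};

/* Simple multiplicative string hash (the widely used djb2 scheme). */
static unsigned long hash_record(const char *s)
{
    unsigned long h = 5381;
    while (*s)
        h = h * 33 + (unsigned char)*s++;
    return h;
}

/* Insert rec if it is absent. Returns 1 if the record was new and
 * 0 if it duplicates a stored record. The set must start zeroed. */
int record_set_add(struct record_set *set, const char *rec)
{
    unsigned long i = hash_record(rec) & (TABLE_SIZE - 1);

    assert(strlen(rec) <= MAX_RECORD_LEN);
    assert(set->count < TABLE_SIZE - 1);   /* always leave a free slot */
    while (set->used[i]) {
        if (strcmp(set->slots[i], rec) == 0)
            return 0;
        i = (i + 1) & (TABLE_SIZE - 1);    /* linear probing */
    }
    strcpy(set->slots[i], rec);
    set->used[i] = 1;
    set->count++;
    return 1;
}
```

With a table that stays sparsely filled, each check costs roughly constant time instead of time proportional to the number of records already generated.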


C. ANALYSIS OF THE DATABASE LOAD SUBSYSTEM 


The objectives of the database load subsystem were
discussed in Chapter II. The objective that was not met by
this tool is the ability to vary the data placement
strategy. This ability must be added at a future date.
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The database load subsystem meets all goals with the 
exception of efficiency. Modifiability is achieved through 
the extensive use of modularization with respect to grouping 
like operations together. For instance, all of the routines
to pass messages are grouped in send and receive modules
which are kept in separate files. Reliability has been
observed in that no errors have existed since the
operational phase. Simplicity is demonstrated by using
menu-driven operations. Lastly, understandability is achieved by
religious use of abstraction both in the data and the
operations. The gross inefficiency in the package results from
the use of a large number of insert requests which are sent
one at a time to the backends. This inefficiency could be
reduced by grouping several insert requests into a
transaction and then sending the transaction to the backends. It
is also possible to save all type-C descriptors in the
database load subsystem and send all of them to Insert
Information Generation at the end of the directory table
loading. This concludes the evaluation of the database load
subsystem.
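
The grouping suggested above can be sketched as a loop that hands the insert requests to a send routine in fixed-size batches. This is an illustration only: `send_in_transactions`, the callback type, and the counting callback are hypothetical stand-ins for whatever routine would actually deliver a transaction to the backends.

```c
#include <assert.h>

/* Hypothetical callback that delivers one transaction of n requests. */
typedef void (*send_fn)(const char *reqs[], int n, void *ctx);

/* Send n_reqs insert requests in groups of at most batch_size.
 * Returns the number of transactions sent. A batch_size of 1
 * corresponds to sending the requests one at a time. */
int send_in_transactions(const char *reqs[], int n_reqs,
                         int batch_size, send_fn send, void *ctx)
{
    int sent = 0, start;

    assert(n_reqs >= 0 && batch_size > 0);
    for (start = 0; start < n_reqs; start += batch_size) {
        int remaining = n_reqs - start;
        int n = remaining < batch_size ? remaining : batch_size;
        send(&reqs[start], n, ctx);
        sent++;
    }
    return sent;
}

/* Example callback: adds up how many requests were delivered. */
void count_requests(const char *reqs[], int n, void *ctx)
{
    (void)reqs;
    *(int *)ctx += n;
}
```

For example, five insert requests with a batch size of two result in three transactions rather than five separate messages.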


D. ANALYSIS OF THE REQUEST GENERATION PACKAGE 


The objectives of the test request generation package 
were discussed in Chapter II. The objectives that were not 
met by this tool are the following enhancements: program
generation of requests, simulation of multiple concurrent
users, and development of a storage information package to
aid in request selection. These abilities must be added at
a future date.

The test request generation package meets all goals with 
the exception of possessing consistency. Modifiability is
achieved through the extensive use of modularization with
respect to grouping like operations together. For instance,
all of the routines which are involved with creating a
request are divided into modules each of which handles a
distinct aspect of the request. This goal is seen
throughout MDBS. Reliability has been observed in that no
errors have existed since the operational phase. Simplicity
is demonstrated by using menu-driven operations. Lastly,
understandability is achieved by religious use of
abstraction both in the data and the operations. Consistency may
be achieved by altering the test request generation to use
information stored in the files generated by both the test
file generation package and the database load subsystem.
These files could be used for the extraction of necessary
information instead of prompting the user to re-enter data
supplied earlier. It is the weakest link in establishing a
tight performance evaluation environment. This is further
discussed in the next section. This concludes the
evaluation of the test request generation package.


E. FUTURE DEVELOPMENTS


The most important future development should be the 
integration of the performance evaluation tools into a
performance evaluation environment. In this way, the
property of consistency of the tools will be attained. That is,
the output of one tool can be used as input to the next tool
in the logical sequence of the performance evaluation
effort. This has been achieved in the test file generation
package-database load subsystem interface. The next step
would be to develop consistency between the database load
subsystem-test request generation package interface.

This concludes the discussion on the analysis of the
performance evaluation tools.
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VII. CONCLUSIONS 

In this thesis, we have discussed the development of the
necessary tools for the performance evaluation of a
multi-backend database system, known as MDBS. The basic
motivation of the multi-backend database system (MDBS) was to
develop an architecture which spreads the work of the
database system among multiple backends. It was a major aim of
this system to allow capacity growth by the use of
additional disk drives and performance improvement by the use of
additional backends. However, to verify the design and
implementation, it is necessary to test the capability of
MDBS in capacity growth and performance gain.

Three tools for the performance and capacity tests were 
investigated. The first tool was the file generation
package which creates test files for any artificial
database. The second tool was the database load subsystem which
loads the artificial database into MDBS. The third tool was
the request generation package. This package created test
requests to query MDBS.

The following methodology was used to create an
effective tool. First, the properties of an ideal tool were
described. Then available existing programs were reviewed
and evaluated to determine which program best meets the
desired features. The programs were upgraded to ensure that
they were compatible with the current implementation, and
met the desired features. Lastly, the tools were analyzed
with respect to meeting the desired properties and
satisfying several software engineering goals.

The main goal was to develop the necessary tools to
generate tests in measuring the extensibility of MDBS, i.e.,
how does MDBS perform as more backends are added?
Performance was expected to improve (maintain) as the number
(size) of the backends (database) was increased. We feel
that the tools developed herein will allow an easy and
efficient means of measuring the extensibility of MDBS.


APPENDIX A

DESIGN SPECIFICATION OF THE TEST FILE GENERATION PACKAGE


This appendix contains the design of the test file
generation package which is a subset of the shortened
database load subsystem. The design consists of C language code
for the function headings and their corresponding
declarations. The body of the functions are given in English text.


/**********************/
/*     TEST FILE      */
/*     GENERATION     */
/*     PACKAGE        */
/*     DESIGN         */
/**********************/


main_program()
begin
    generate();    /* generate the records */
end


generate()
/* This routine                                          */
/*   - generates a record template                       */
/*   - generates/modifies sets of values for attributes  */
/*   - generates descriptors                             */
/*   - generates records using the sets                  */
begin
    while ( TRUE )
        /* Ask the user for type of operation to be performed */
        /* Take appropriate action */

            /* Generate record template */
            gen_tmpl();

            /* Generate descriptors */
            gen_desc();

            /* Generate/modify sets */
            gm_set();

            /* Generate the records */
            gen_rec();

            /* Load the records */

            /* do nothing */
    endwhile;
end


gen_tmpl()
/* This routine generates a record template */
begin
    char tfn(MFNLength + 1);    /* template-file name */
    char c, dbid(DBIDLNTH + 1), hold(MAX_FIELDS + 1), temtyp;
    int  k, no_attr;
    FILE *fopen(), *tmpl_fp;

    /* Get name of template file */
    /* Open template file */
    /* Get database ID from the template file */
    /* Write database ID to template file */
    /* Get number of attributes */
    /* Write number of attributes to template file */
    /* Get attributes and value types */
    for ( each attribute )
    begin
        /* Enter the attribute name */
        /* Enter the value type: (s=string, i=integer) */
    end /* end for */
    /* Close template file */
end /* end gen_tmpl */


gen_desc()
begin
    char tfn(MFNLength + 1);    /* template-file name */
    char dfn(MFNLength + 1);    /* descriptor-file name */
    char attr_name(ANLength + 1),
         answer(5), desc_type, val_type, c, hold(3);
    int  i, j, no_attr;
    FILE *fopen(), *tmpl_fp, *desc_fp;

    /* Get the template-file name */
    /* Open template file */
    /* Get the name of the file for storing descriptors */
    /* Open descriptor file */
    /* Read thru Database ID to get */
    /* to number of attributes */
    /* Get number of attributes */
    /* For each attribute get its descriptors (if applicable) */
    for ( each attribute )
    begin
        /* Read attribute */
        /* Get attribute name */
        /* Get value type for the attribute */
        /* Ask if the attribute */
        /* is to be a directory attribute */
        if ( answer = yes )
        begin
            /* Write attribute name to descriptor file */
            /* Get descriptor type for attribute */
            /* Write descriptor type to descriptor file */
            if ( desc_type == 'C' || desc_type == 'c' )
                gen_C(val_type, desc_fp);
            else
                gen_notC(val_type, desc_fp);
            /* Write end_of_data symbol to descriptor file */
        end
    end /* end for */
    /* Write end_of_file symbol to descriptor file */
    /* Close files */
end /* gen_desc */


gen_C(val_type, desc_fp)
char val_type;
FILE *desc_fp;
begin
    char lowerb(AVLength), upperb(AVLength), hold(3);
    int  fault, k;

    /* Get upper bounds for type 'C' descriptors */
    while ( TRUE )
    begin
        /* Get upper bound */
        if ( end of data )
            return;
        else
        begin
            /* Verify upper bound entry against */
            /* attribute value type */
            /* Write NOBOUND and upper bound */
            /* to descriptor file */
        end
    end
end /* end gen_C */


gen_notC(val_type, desc_fp)
char val_type;
FILE *desc_fp;
begin
    char lowerb(AVLength), upperb(AVLength), hold(3);
    int  fault, k;

    /* Get lower and upper bounds for descriptor */
    while ( TRUE )
    begin
        /* Get lower bound */
        if ( end of data )
            return;
        else
        begin
            /* Verify lower bound entry against */
            /* attribute value type */
            /* Write lower bound to descriptor file */
        end
        /* Get upper bound */
        /* Verify upper bound entry against */
        /* attribute value type */
        /* Write upper bound to descriptor file */
    end /* end while */
end /* end gen_notC */


gm_set()
/* This routine generates/modifies sets of values. */
begin
    char tfn(MFNLength + 1);    /* template-file name */
    char attr_name(ANLength + 1), answer, c, val_type,
         hold(AVLength + 1);
    int  no_attr, k, i;
    FILE *fopen(), *tmpl_fp;

    /* Get the template-file name */
    /* Open template file */
    /* Get number of attributes */
    for ( each attribute )
    begin
        /* Get attribute name */
        /* Get value type */
        /* Choose the action to be taken on attribute
             n - generate a new set for it.
             m - modify an existing set for it.
             s - do nothing with it.  */
        switch ( answer )
        begin
            case 'n':
                /* generate new set */
                gen_set( val_type );
                break;
            case 'm':
                mod_set( val_type );
                break;
            case 's':
                break;
        end /* end switch */
    end /* end for */
    /* Close template file */
end /* end gm_set */


gen_set(val_type)
/* This routine generates a set */
/* of values for an attribute. */
char val_type;
begin
    struct definition
    begin
        char elem(SetSize)(AVLength + 1);
             /* array for holding set elements */
        int  no_elem;
             /* number of elements in set */
    end set;

    char answer(5);
    FILE *fopen(), *tmpl_fp;

    /* Get name of set file */
    /* Open set file */
    /* Accept elements in set */
    while ( set is not full )
    begin
        /* Enter a value for the set */
        /* Verify set element against attribute type */
        /* Check for set element duplication */
    end
    if ( set is full )
        /* Tell user */

    /* Write set elements to set file */
    /* Write end_of_file symbol to set file */
    /* Close set file */
    /* Ask if user wants to modify it */
    if ( answer = yes )
        mod_set(val_type);
end /* end gen_set */


mod_set(val_type)
/* This routine modifies a set */
/* of values for an attribute. */
begin
    char ofn(MFNLength + 1),    /* old-file name */
         nfn(MFNLength + 1);    /* new-file name */
    char c, answer, entry(AVLength + 1), index(5);
    int  i, k;

    struct
    begin
        int  no_elem;                      /* number of elements in the set */
        char removed(SetSize);             /* element removed flag */
        char elem(SetSize)(AVLength + 1);  /* elements */
    end set;

    FILE *fopen(), *set_fp;

    /* Get the name of the set to be modified */
    /* Open file */
    /* Read set file into array for manipulation */
    while ( TRUE )
    begin
        /* Ask what do you want to perform next?
             p - print the set elements and their indices
             a - add some elements to the set
             r - remove some elements from the set
             n - nothing; done  */
        if ( answer = 'p' )
        begin
            /* Print elements of file */
        end /* end ( answer = 'p' ) */
        else if ( answer = 'a' )
        begin
            /* Add some elements */
            /* Check for set element duplication */
            /* Verify entry against */
            /* attribute value type */
            /* Add element to array if correct */
        end /* end ( answer = 'a' ) */
        else if ( answer = 'r' )
        begin
            /* Remove some elements */
            /* Mark set elements for removal */
            /* Re-order array to reflect deletions */
        end /* end ( answer = 'r' ) */
        else
            /* Nothing; done */
            break;    /* to exit while */
    end /* end while (TRUE) */
    /* Ask if user wants to store the modified set back
       into the original file */
    /* Write array back into file designated */
    /* Write end_of_file symbol to set file */
    /* Close set file */
end /* end mod_set */


gen_rec()
/* This routine generates records using sets. */
begin
    char c;
    char attr_name(AVLength + 1);
    char dbid(DBIDLNTH + 1),
         gr_records(MAX_RECORDS)(MRLength + 1);
    char tfn(MFNLength + 1),    /* template-file name */
         rfn(MFNLength + 1),    /* record-file name */
         vfn(MFNLength + 1);    /* temporary file name */

    struct
    begin
        int  no_elem(MAX_FIELDS);
        char elems(MAX_FIELDS)(SetSize)(AVLength + 1);
    end values;

    FILE *fopen(), *tmpl_fp, *rec_fp, *stor_fp;

    int  no_attr, k, i, j, count, gr_no_rec, max,
         index, old;

    /* Get the template-file name */
    /* Open template file */
    /* Get file for record storage */
    /* Open record file */
    /* Read database ID */
    /* Write database ID to storage file */
    /* Read number of attributes in a record */
    /* Read elements of files corresponding to */
    /* each attribute into an array */
    for ( each attribute )
    begin
        /* Read the attribute name */
        /* Get the file name for the given attribute */
        /* Open file */
        /* Read elements of set into array */
        /* Close file */
    end /* end for */
    /* Close template file */

    /* Calculate total possible number of unique records */
    /* Get the number of records to be generated */
    /* Determine feasibility of requested number */
    /* Generate records by choosing (at random) */
    /* a member from each of the given sets */
    for ( each record )
    begin
        for ( each attribute )
        begin
            /* Get a value randomly from the set */
        end
        /* Give some feedback to user of generation effort */
        /* Check generated record for possible duplication */
    end

    /* Write generated records to file */
    /* Write end_of_file symbol to file */
    /* Let user know when completed */
    /* Close file */
end /* end gen_rec */
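
The feasibility step in gen_rec compares the requested number of records with the total number of unique records, which is the product of the sizes of the attribute sets. The following is a minimal sketch of that calculation; the function names are hypothetical, and the overflow guard is an added assumption rather than something stated in the design.

```c
#include <assert.h>
#include <limits.h>

/* Largest number of distinct records that can be formed by taking one
 * value from each set: the product of the set sizes. The result
 * saturates at LONG_MAX instead of overflowing. */
long max_unique_records(const int set_sizes[], int n_attrs)
{
    long total = 1;
    int a;

    assert(n_attrs > 0);
    for (a = 0; a < n_attrs; a++) {
        assert(set_sizes[a] >= 0);
        if (set_sizes[a] == 0)
            return 0;            /* an empty set allows no records */
    }
    for (a = 0; a < n_attrs; a++) {
        if (total > LONG_MAX / set_sizes[a])
            return LONG_MAX;     /* product would overflow */
        total *= set_sizes[a];
    }
    return total;
}

/* A request is feasible when it does not exceed the unique total. */
int request_is_feasible(long requested, const int set_sizes[], int n_attrs)
{
    return requested >= 0 &&
           requested <= max_unique_records(set_sizes, n_attrs);
}
```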


int gr_isdigit(c)
/* This routine determines whether a given */
/* character is a digit. */
char c;
begin
    if ( c is a digit )
        return (TRUE);
    return (FALSE);
end


gs_rand (num)
/* This routine generates a random number */
int num;
begin
static long seed
static int temp


seed * 2429
seed mod 1990


return (temp) ;


else
return (temp mod num);


end
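
The routine above appears to be a multiplicative congruential generator, but its constants are not fully legible in this copy. As an illustration of the technique only, the sketch below substitutes the well-known Park-Miller "minimal standard" constants (multiplier 16807, modulus 2^31 - 1); these values, the function names, and the treatment of a zero argument are assumptions and are not the values used in the package.

```c
#include <assert.h>

static long lcg_seed = 1;   /* any value in 1 .. 2147483646 */

/* One step of the Park-Miller generator:
 * seed = (16807 * seed) mod (2^31 - 1). */
long next_random(void)
{
    lcg_seed = (long)((16807LL * lcg_seed) % 2147483647LL);
    return lcg_seed;
}

/* Return a value in 0 .. num-1, or the raw value when num is 0
 * (assumed behavior for the zero case). */
long random_below(long num)
{
    long r = next_random();

    assert(num >= 0);
    return num == 0 ? r : r % num;
}
```

Because the seed is fixed at start-up, the same sequence is produced on every run, which keeps generated test files reproducible.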
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APPENDIX B 
DESIGN SPECIFICATION OF THE DATABASE LOAD SUBSYSTEM


This appendix contains the design of the shortened
database load subsystem. The design consists of C language code
for the function headings and their corresponding
declarations. The body of the functions are given in English text.


/*********************/
/*                   */
/*   Database Load   */
/*      Design       */
/*                   */
/*********************/


struct rtemp_definition template;


db_load()
/* This routine loads the directory tables and the database */
/* records. */
begin
    /* Initialize counters */
    /* Load the directory tables */
    dbl_dir_tbls();
    /* Load the database records */
    dbl_records();
end


dbl_dir_tbls()
/* This routine loads the directory tables. */
begin
    char dbid(DBIDLNTH + 1),
         attrname(ANLength + 1),
         tfn(MFNLength + 1),    /* template-file name */
         dfn(MFNLength + 1),    /* descriptor-file name */
         valtype,
         Str (
         attrstr(DIL_Attrid + 1),
         desctype;
    int  at_id_no, desc_id_no;
    struct desc_definition descriptor;
    int  k, c;
    FILE *fopen(), *fptr;

    /* Initialize the database mailbox */
    /* Get the name of the file containing */
    /* the template information */
    /* Read the database id */
    /* Read number of entries in the template, i.e., */
    /* number of attributes in a record */
    /* Read the attribute names and the value types */
    /* and place the data in the template record */
    for ( each attribute to be put in template )
    begin
        /* Read an attribute */
        /* Read the corresponding value type */
    end
    /* Create attribute table for the database in backends */
    DBL_S$Create(dbid);
    /* Get the name of the file containing the descriptors */
    /* Read the directory attributes and their descriptors */
    /* Initialize the descriptors */
    /* Initialize the attribute counter */
    while ( not the end of data )
    begin
        /* Read an attribute */
        /* Read corresponding descriptor type */
        /* Add the attribute name to the attribute table */
        DBL_S$Atm_insert(dbid, attrname, ...);
        if ( desctype = 'C' || desctype = 'c' )
            /* Send the attribute to IIG */
            DBL_S$send...(...);
        /* Using the template, find the value */
        /* type for the attribute */
        /* Read the corresponding descriptors */
        /* for the attribute */
        /* Initialize the descriptor id */
        while ( more descriptors )
        begin
            /* Get lower bound */
            /* Get upper bound */
            /* Add the descriptor to DDIT */
            DBL_S$Desc_insert(dbid, attrname, &desctype,
                &descriptor, &valtype, at_id_no, desc_id_no);
            /* Increment the descriptor id count */
        end /* end while */
        if ( desctype != 'C' )
        begin
            /* Add the catchall descriptor to DDIT */
            DBL_S$Catchall(dbid, attrname,
                ..., at_id_no, desc_id_no);
        end /* end if */
        /* Increment the attribute count */
    end /* end while */
    /* Close descriptor file */
end /* end dbl_dir_tbls */


dbl_records()
begin
    char dbid(DBIDLNTH + 1),
         rfn(MFNLength + 1),    /* record-file name */
         req(REQLength),
         record(80);
    struct rtemp_definition *tmpl_ptr,
                            *Get_tmpl_ptr();
    int  i, c;
    FILE *fopen(), *fptr;

    /* Get the name of the file */
    /* containing the records to be loaded */
    /* Read the database id */
    /* Get the record template for the database */
    while ( more records exist )
    begin
        /* While there are more records */
        /* Read the next one */
        /* Construct a request to insert the record */
        dbl_construct_ins( tmpl_ptr, record, req );
        /* Send the request to Request Preparation */
        DBL_S$TrafUnit(dbid, req);
    end
end /* end dbl_records */


dbl_construct_ins(tmpl_ptr, record, req)
struct rtemp_definition *tmpl_ptr;
char req(), record();
begin
    int i, j, k, p, entry_no;

    /* Load the initial part of request */
    while ( not the end of the record )
    begin
        /* Load the attribute name */
        /* Load the attribute value */
    end
    /* Load the end of request */
end




APPENDIX C 


DESIGN SPECIFICATION OF THE TEST REQUEST GENERATION PACKAGE 


The design specification for the test request
generation and execution package is shown in this appendix. This
design is the result of the work of Dr. Kerr, who headed the
design of the original test request generation package.


The Top Level of Test Request Generation Package


This program can be used to test and demonstrate MDBS.
The execution of this program is called a session. Each
session can be divided into any number of subsessions.
During a subsession the user can do one of the following:

(A) Execute a list of requests that was previously
stored in a file.

(B) Prompt the user for a list of requests to be
stored in a file for later use.

(C) Retrieve a list of requests that were
previously stored in a file and then allow the user to
select requests from that list for execution.
This selection can be done in any order. The user
will also be able to enter a new request to be
executed.

(D) Modify an existing list of requests that was
previously stored in a file.

In this version, requests are allowed to be grouped as
transactions. A request is sent to MDBS. The program waits
for a response before sending the next request, or
continues to execute without waiting for a response if the
user so desires.

Output may be directed to the user's terminal, to a
file, or to both.

/*********************************/
/*                               */
/*         Test Request          */
/*          Generation           */
/*        Package Design         */
/*                               */
/*********************************/

task SPE Test. 

   scalar more-subsessions;   /* flag: TRUE - continue,
                                       FALSE - stop */

   Print initial message to user;
   more-subsessions := TRUE;
   while more-subsessions do
      perform SUBSESSION;
      Prompt for continue message;
      Read continue message;
      if user does not want to continue
         then
            more-subsessions := FALSE;
      end if ;
   end while ;
end task ;


procedure SUBSESSION;

   /* During a subsession the user is able                  */
   /*    to generate a list of requests.      (NEW__LIST)   */
   /*    to modify an old list of requests.   (MODIFY)      */
   /*    to select requests, one at a time, from a list     */
   /*       of requests.                      (SELECT)      */
   /*    to run a group of requests.          (OLD__LIST)   */

   scalar current-request-file;   /* The name of the file   */
      /* Initial value should be NULL. This name must be    */
      /* retained from one subsession to the next.          */
   scalar type-of-subsession;   /* Possible values are NEW__LIST,
                                   MODIFY, SELECT and OLD__LIST */

   Prompt for next type-of-subsession;
   Read next type-of-subsession;
   case type-of-subsession value
      NEW__LIST:   /* Enter a new request-list */
         perform NEW__LIST__SUB( current-request-file );
      MODIFY:      /* Modify an old request-list */
         perform MODIFY__SUB( current-request-file );
      SELECT:      /* Select requests, one at a time, from an */
                   /* existing request-list */
         perform SELECT__SUB( current-request-file );
      OLD__LIST:   /* Execute an existing request-list */
         perform OLD__LIST__SUB( current-request-file );
      otherwise : Print error message;
   end case ;
end procedure ;
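
The top-level task and the SUBSESSION dispatch above can be sketched in Python as follows. This is only an illustrative sketch: the function `run_session`, the handler signature, and the file name used in the usage lines are assumptions introduced here, and the stand-in handlers do not prompt the user as the real subsessions would.

```python
from typing import Callable, Dict, Iterable, Optional

# A handler receives the current request file (or None) and returns the
# file name that should be current after the subsession has finished.
Handler = Callable[[Optional[str]], Optional[str]]


def run_session(choices: Iterable[str],
                handlers: Dict[str, Handler]) -> Optional[str]:
    """Run one subsession per choice; stop when the choices are exhausted."""
    current_request_file: Optional[str] = None   # initial value is NULL
    for choice in choices:
        handler = handlers.get(choice)
        if handler is None:
            # Corresponds to the 'otherwise' branch of the case statement.
            print(f"unknown subsession type: {choice}")
            continue
        # The file name is retained from one subsession to the next.
        current_request_file = handler(current_request_file)
    return current_request_file


# Usage with stand-in handlers (hypothetical).
handlers: Dict[str, Handler] = {
    "NEW_LIST": lambda current: "requests.1",
    "OLD_LIST": lambda current: current,
}
final_file = run_session(["NEW_LIST", "BOGUS", "OLD_LIST"], handlers)
```

A dictionary of handlers plays the role of the case statement, and an unknown subsession type is reported and skipped rather than ending the session.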


procedure NEW__LIST__SUB ( output : current-request-file );
   scalar current-request-file;   /* name of the file */

   /* Asks user for requests, one at a time. */
   /* Saves list of requests in a file with file-name given by */
   /* user. */

   scalar request-list-file-name;
      /* of file to use to store the requests */
   record request;
   scalar next-step;
      /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */

   Prompt for request-list-file-name;
   Read request-list-file-name;
   open file( request-list-file-name ) output;
   perform ENTER__AND__SAVE__REQUESTS ( request-list-file-name );
   close file( request-list-file-name );
   current-request-file := request-list-file-name;
end procedure ;


procedure MODIFY__SUB ( input/output : current-request-file );
   scalar current-request-file;   /* The name of the file */

   /* Retrieve an old request-list and then allow the user to  */
   /* modify it. Requests are examined one at a time allowing  */
   /* changes to be made to each request in turn. A change     */
   /* can be                                                   */
   /*    add new request before this one.                      */
   /*    modify this request.                                  */
   /*    remove this request.                                  */
   /*    make no changes to this request.                      */
   /* Note that we must have a way to append new requests at   */
   /* the end of the input request list.                       */
   /* The input file ( called input-request-file ) may be      */
   /* either the current-request-file or a different existing  */
   /* request file.                                            */
   /* The output file ( called new-request-file ) may be       */
   /* either the next version of the input-request-file or a   */
   /* new file.                                                */

   scalar input-request-file;   /* The list of requests
                                   to be modified. */
   scalar new-request-file;     /* The new list of requests. */
   scalar next-version;   /* flag: TRUE - set new-request-file to */
      /* next version of input-request-file, FALSE - get new name */
   record request;
   scalar more-requests-in-input-request-file;   /* continue flag */
   scalar more-requests-to-enter;   /* continuation flag */
   scalar change-type;   /* ADD, MODIFY, REMOVE, or NO__CHANGE */
   scalar next-step;
      /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */


   /* Determine input-request-file to be modified. */
   perform DETERMINE__INPUT__FILE( current-request-file,
                                   input-request-file );
   open file( input-request-file ) input;
   /* Determine if user wants the name of the new-request-file */
   /* to be the next version of the input-request-file         */
   /* or a new name.                                           */
   Prompt user to determine next-version;
   Read next-version;
   if next-version
      then
         Set new-request-file to next version of
            input-request-file;
      else
         Prompt for new-request-file name;
         Read name of new-request-file;
   end if ;
   open file( new-request-file ) output;


   Read first request from input-request-file;
   more-requests-in-input-request-file := TRUE;
   while more-requests-in-input-request-file do
      Prompt user for change-type for this request;
      Read change-type;
      case change-type value
         ADD:   /* enter and save the next request */
            perform GET__NEW__REQUEST ( request );
            Write request into new-request-file;
         MODIFY:
            Prompt and get modified request from user;
            Write new request into new-request-file;
            Read next request from input-request-file;
         REMOVE:
            Read next request from input-request-file;
         NO__CHANGE:
            Write current request into new-request-file;
            Read next request from input-request-file;
         otherwise : Print system error message;
      end case ;
   end while ;


   /* Note that at this point all the old requests have been */
   /* processed. However it is possible that the user wants  */
   /* to append more requests.                               */

   Prompt user that input file has been processed, but that
      more requests may still be appended;
   perform ENTER__AND__SAVE__REQUESTS( new-request-file );
   close file( input-request-file );
   close file( new-request-file );
   current-request-file := new-request-file;

end procedure ;
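
The request-by-request editing loop of the modify subsession can be sketched in Python as follows. This is a sketch under assumptions rather than the package's code: requests are plain strings held in memory instead of files, the user's decisions are supplied as a prepared sequence of `(change type, text)` pairs, and the name `modify_requests` is hypothetical.

```python
from typing import Iterable, List, Optional, Tuple

# (change type, new request text if the change supplies one)
Change = Tuple[str, Optional[str]]


def modify_requests(old: List[str],
                    changes: Iterable[Change],
                    appended: Iterable[str] = ()) -> List[str]:
    """Apply one decision at a time to the old list and return the new list.

    Raises StopIteration if the decisions run out before the old list does.
    """
    new: List[str] = []
    decisions = iter(changes)
    i = 0                                 # index of the request being examined
    while i < len(old):
        kind, text = next(decisions)
        if kind == "ADD":
            # The new request goes before the current one, which is
            # then examined again on the next pass.
            if text is None:
                raise ValueError("ADD needs a request")
            new.append(text)
        elif kind == "MODIFY":
            if text is None:
                raise ValueError("MODIFY needs a request")
            new.append(text)
            i += 1
        elif kind == "REMOVE":
            i += 1                        # drop the current request
        elif kind == "NO_CHANGE":
            new.append(old[i])
            i += 1
        else:
            raise ValueError(f"unknown change type: {kind}")
    # After the old list is exhausted, further requests may be appended.
    new.extend(appended)
    return new


# Usage: insert before r1, keep r1, rewrite r2, drop r3, then append r4.
result = modify_requests(
    ["r1", "r2", "r3"],
    [("ADD", "r0"), ("NO_CHANGE", None), ("MODIFY", "r2b"), ("REMOVE", None)],
    ["r4"],
)
```

Only the ADD branch leaves the index unchanged, which mirrors the listing: every other branch reads the next request from the input file.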


procedure SELECT__SUB ( input/output : current-request-file );
   scalar current-request-file;   /* The name of the file */

   /* Retrieve an old list of requests.     */
   /* Allow user to select from this list.  */
   /* Also allow user to enter new request. */

   scalar input-request-file;   /* file containing requests */
   array requests( MAX__NUMBER__OF__REQUESTS );
                                /* from input-request-file */
   scalar number-of-requests;   /* The actual number in */
      /* input-request-file must be less than */
      /* MAX__NUMBER__OF__REQUESTS */
   scalar request-number;   /* of the request chosen */
   record new-request;      /* Provided by user. */
   record response;   /* to the request being executed. */

   scalar more-to-execute;   /* flag to control loop */
   scalar next-operation;
      /* Values can be REQUEST__NUMBER, DISPLAY, */
      /* NEW__REQUEST or STOP */


   /* Determine the new input-request-file to use for */
   /* this subsession. */
   perform DETERMINE__INPUT__FILE( current-request-file,
                                   input-request-file );
   open( input-request-file );
   Read and store input-request-file into requests checking that
      number-of-requests is less than MAX__NUMBER__OF__REQUESTS;
   close( input-request-file );
   perform DISPLAY( requests );

   /* Determine whether response is to go to CRT, file or both */
   perform OUTMSFORMAT;
   more-to-execute := TRUE;

   while more-to-execute do
      Prompt user for next-operation;   /* should be either a */
         /* request-number, a request-to-display or a */
         /* new-request */
      Read next-operation;

      case next-operation value
         REQUEST__NUMBER:
            Check that request-number is less than
               number-of-requests;
            perform EXECUTE( requests(request-number),
                             response );
            /* Output the response to CRT, file or CRT_&file,
               as appropriate. */
            perform OUTMSRESPONSE( response );
         DISPLAY: perform DISPLAY( requests );
         NEW__REQUEST:
            perform GET__NEW__REQUEST( new-request );
            perform EXECUTE( new-request, response );
            /* Output the response to CRT, file or CRT_&file,
               as appropriate. */
            perform OUTMSRESPONSE( response );
         STOP: more-to-execute := FALSE;
         otherwise : print error message;
      end case ;
   end while ;

   perform OUTMSFINISH;
   current-request-file := input-request-file;


end procedure ; 
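
The selection loop of this subsession can be sketched in Python as follows. The sketch makes several assumptions that are not part of the design: operations arrive as a prepared sequence (an integer selects a stored request by number, starting at 1, any other string is treated as a new request), `execute` is a stand-in for sending a request to MDBS, and the name `select_and_run` is hypothetical.

```python
from typing import Callable, Iterable, List, Union

Operation = Union[int, str]


def select_and_run(requests: List[str],
                   operations: Iterable[Operation],
                   execute: Callable[[str], str]) -> List[str]:
    """Execute stored or newly entered requests in the order chosen."""
    responses: List[str] = []
    for op in operations:
        if op == "STOP":
            break
        if op == "DISPLAY":
            # Show the stored requests with their numbers.
            for number, request in enumerate(requests, start=1):
                print(number, request)
        elif isinstance(op, int):
            # A request number must refer to a stored request.
            if not 1 <= op <= len(requests):
                print(f"no request numbered {op}")
                continue
            responses.append(execute(requests[op - 1]))
        else:
            # Anything else is taken to be a new request typed by the user.
            responses.append(execute(op))
    return responses


# Usage: str.upper stands in for execution by the database system.
outputs = select_and_run(["a", "b"], [2, "DISPLAY", 5, "c", "STOP", 1], str.upper)
```

Operations after STOP are ignored, and an out-of-range request number is reported without ending the loop.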


procedure OLD__LIST__SUB ( input/output : current-request-file );
   scalar current-request-file;   /* The name of the file */

   /* Retrieve and execute an old list of requests. */

   scalar input-request-file;   /* The file containing requests */
   record request;
   record response;   /* to a request that has been executed. */

   /* Determine the new current-request-file to use for this */
   /* subsession. */
   perform DETERMINE__INPUT__FILE( current-request-file,
                                   input-request-file );
   open( input-request-file ) input;
   Read first request from input-request-file;

   /* Determine whether response is to go to CRT, file or both. */
   perform OUTMSFORMAT;
   while more-requests do
      perform EXECUTE( request, response );
      /* Output the response to CRT, file or CRT_&file, as */
      /* appropriate. */
      perform OUTMSRESPONSE( response );
      Read next request from input-request-file;
   end while ;


   perform OUTMSFINISH;
   close( input-request-file );
   current-request-file := input-request-file;

end procedure ;


procedure ENTER__AND__SAVE__REQUESTS
      ( input : request-list-file-name );
   scalar request-list-file-name;
      /* of file to use to store the requests */
   record request;
   scalar next-step;
      /* I(nsert), R(etrieve), U(pdate), D(elete) or F(inish) */

   next-step := I;
   while next-step <> F do
      Prompt for next-step;
      case next-step value
         I: /* enter and save the next insert request */
            perform INSERT__SUB( request );
            Write request into request-list-file-name;
         R: /* enter and save the next retrieve request */
            perform RETRIEVE__SUB( request );
            Write request into request-list-file-name;
         U: /* enter and save the next update request */
            perform UPDATE__SUB( request );
            Write request into request-list-file-name;
         D: /* enter and save the next delete request */
            perform DELETE__SUB( request );
            Write request into request-list-file-name;
         F: ;   /* no more requests */
         otherwise : Print error message;
      end case ;
   end while ;
end procedure ;
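
The enter-and-save loop above can be sketched in Python as follows. It is a sketch under assumptions: the steps are supplied as a prepared sequence instead of being prompted for, each builder is a hypothetical stand-in for one of the request-entry procedures, and requests are written one per line to any text stream.

```python
import io
from typing import Callable, Dict, Iterable, TextIO


def enter_and_save(steps: Iterable[str],
                   builders: Dict[str, Callable[[], str]],
                   out: TextIO) -> int:
    """Write one request per recognised step until the F(inish) step."""
    saved = 0
    for step in steps:
        if step == "F":                   # finish: no more requests
            break
        builder = builders.get(step)
        if builder is None:
            print(f"unrecognised step: {step}")   # 'otherwise' branch
            continue
        out.write(builder() + "\n")       # save the request just entered
        saved += 1
    return saved


# Usage: an in-memory stream stands in for the request-list file.
buffer = io.StringIO()
builders = {"I": lambda: "insert-1", "R": lambda: "retrieve-1"}
count = enter_and_save(["I", "X", "R", "F", "I"], builders, buffer)
```

Writing to a stream passed in by the caller matches the design, where the calling procedure opens and closes the file around this step.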


procedure DETERMINE__INPUT__FILE( input :
                                     current-request-file,
                                  output : input-request-file );
   scalar current-request-file;
   scalar input-request-file;
   /* Determine the input file to be used. It may be either */
   /* the current-request-file or a different existing      */
   /* request file.                                         */

   scalar modify-current-file-flag;
      /* TRUE - select new input file */

   if current-request-file is NULL
      then
         Prompt for name of input-request-file;
         Read name of input-request-file;
      else /* Determine if user wants to use the */
           /* current-request-file or a different old file. */
         Prompt user to determine modify-current-file-flag;
         Read modify-current-file-flag;
         if modify-current-file-flag
            then
               Prompt for name of input-request-file;
               Read name of input-request-file;
            else
               input-request-file := current-request-file;
         end if ;
   end if ;
end procedure ;
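
The file-selection rule above can be sketched in Python as follows. The function name and its two callback parameters are assumptions introduced for illustration: `ask_name` stands in for prompting for a file name and `wants_different` for reading the flag from the user.

```python
from typing import Callable, Optional


def determine_input_file(current: Optional[str],
                         ask_name: Callable[[], str],
                         wants_different: Callable[[], bool]) -> str:
    """Choose the request file to read for this subsession."""
    if current is None:
        # No file is current yet, so a name must be supplied.
        return ask_name()
    if wants_different():
        # The user prefers a different existing file.
        return ask_name()
    return current


# Usage with fixed answers in place of interactive prompts.
first = determine_input_file(None, lambda: "a.req", lambda: False)
other = determine_input_file("cur.req", lambda: "b.req", lambda: True)
same = determine_input_file("cur.req", lambda: "b.req", lambda: False)
```

The flag is only consulted when a current file exists, as in the nested if of the listing.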


procedure GET__NEW__REQUEST ( output : request );
   record request;   /* to be obtained from user */
   /* Prompts user for information necessary to enter a */
   /* new request. Returns the request. */

   scalar request-type;
      /* I(nsert), R(etrieve), U(pdate) or D(elete) */

   case request-type value
      I: perform INSERT__SUB( request );
      U: perform UPDATE__SUB( request );
      D: perform DELETE__SUB( request );
      R: perform RETRIEVE__SUB( request );
      otherwise : Print error message;
   end case ;
end procedure ;


procedure DISPLAY ( input : requests );
   /* Display the requests and their numbers at the */
   /* terminal. */

   array requests ( MAX__NUMBER__OF__REQUESTS );
      /* to be displayed. */

end procedure ;


procedure EXECUTE ( input : request,
                    output : response );
   /* Ask MDBS to execute this request. Return the response. */

   record request;    /* to be executed */
   record response;   /* to the execution of the request */

end procedure ;